SEQUENCE LISTING



(1) GENERAL INFORMATION: 

(i) APPLICANT: Altieri, Dario C. 

(ii) TITLE OF INVENTION: SURVIVIN, A PROTEIN THAT INHIBITS 
CELLULAR APOPTOSIS, AND ITS MODULATION 

(iii) NUMBER OF SEQUENCES: 35 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: MORGAN, LEWIS & BOCKIUS LLP 

(B) STREET: 1800 M Street, N.W. 

(C) CITY: Washington 

(D) STATE: D.C. 

(E) COUNTRY: USA 

(F) ZIP: 20036-5869 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: US 08/975,080 

(B) FILING DATE: 20-NOV-1997 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 60/031,435 

(B) FILING DATE: 20-NOV-1996 

(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: Adler, Reid G. 

(B) REGISTRATION NUMBER: 30,988 

(C) REFERENCE/DOCKET NUMBER: 044 574-5022-01-WO 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 202-467-7000 

(B) TELEFAX: 202-467-7176 



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "oligonucleotide" 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:




TGCTGGCCGC TCCTCCCTC 19



(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "oligonucleotide" 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2: 
ATGACCTCCA GAGGTTTC 18 
(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

Ala Pro Thr Leu Pro Pro Ala Trp Gln Pro Phe Leu Lys Asp His Arg
1 5 10 15

Ile



(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

Glu Gly Trp Glu Pro Asp Asp Asp Pro Ile Glu Glu His Lys Lys His
1 5 10 15

Ser Ser Gly Cys

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
GCGGGTGAGC TGTCCCTTGC AGATGGC 
(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:
CCATGTAAGT TGATTTTTCT AGAGAGG 
(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:
AATTGTATGT CTTTATTTCC AGGCAAA 
(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 45 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:

Glu Glu Ala Arg Leu Val Thr Phe Gln Asn Trp Pro Asp Ala Phe Leu
1 5 10 15

Thr Pro Gln Glu Leu Ala Lys Ala Gly Phe Tyr Tyr Leu Gly Arg Gly
20 25 30

Asp Gln Val Gln Cys Phe Ala Cys Gly Gly Lys Leu Ala
35 40 45

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 45 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

Glu Glu Ala Arg Phe Leu Thr Tyr Ser Met Trp Pro Leu Ser Phe Leu 
1 5 10 15

Ser Pro Ala Glu Leu Ala Arg Ala Gly Phe Tyr Tyr Ile Gly Pro Gly
20 25 30 

Asp Arg Val Ala Cys Phe Ala Cys Gly Gly Lys Leu Ser 
35 40 45 

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 45 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

Glu Ala Asn Arg Leu Val Thr Phe Lys Asp Trp Pro Asn Pro Asn Ile
1 5 10 15

Thr Pro Gln Ala Leu Ala Lys Ala Gly Phe Tyr Tyr Leu Asn Arg Leu
20 25 30

Asp His Val Lys Cys Val Trp Cys Asn Gly Val Ile Ala
35 40 45

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 45 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

Glu Glu Val Arg Leu Asn Thr Phe Glu Lys Trp Pro Val Ser Phe Leu 
1 5 10 15 

Ser Pro Glu Thr Met Ala Lys Asn Gly Phe Tyr Tyr Leu Gly Arg Ser 
20 25 30 

Asp Glu Val Arg Cys Ala Phe Cys Lys Val Glu Ile Met
35 40 45 

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 45 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

Lys Ala Ala Arg Leu Gly Thr Tyr Thr Asn Trp Pro Val Gln Phe Leu
1 5 10 15

Glu Pro Ser Arg Met Ala Ala Ser Gly Phe Tyr Tyr Leu Gly Arg Gly
20 25 30

Asp Glu Val Arg Cys Ala Phe Cys Lys Val Glu Ile Thr
35 40 45 

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

Glu Glu Ala Arg Leu Ala Ser Phe Arg Asn Trp Pro Phe Tyr Val Gln
1 5 10 15

Gly Ile Ser Pro Cys Val Leu Ser Glu Ala Gly Phe Val Phe Thr Gly
20 25 30

Lys Gln Asp Thr Val Gln Cys Phe Ser Cys Gly Gly Cys Leu Gly
35 40 45 

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 45 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

Glu Ala Asn Arg Leu Val Thr Phe Lys Asp Trp Pro Asn Pro Asn Ile
1 5 10 15

Thr Pro Gln Ala Leu Ala Lys Ala Gly Phe Tyr Tyr Leu Asn Arg Leu
20 25 30

Asp His Val Lys Cys Val Trp Cys Asn Gly Val Ile Ala
35 40 45 

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 46 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

Glu Glu Ala Arg Leu Lys Ser Phe Gln Asn Trp Pro Asp Tyr Ala His
1 5 10 15

Leu Thr Pro Arg Glu Leu Ala Ser Ala Gly Leu Tyr Tyr Thr Gly Ile
20 25 30

Gly Asp Gln Val Gln Cys Phe Cys Cys Gly Gly Lys Leu Lys
35 40 45

(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 46 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

Glu Glu Ala Arg Leu Lys Ser Phe Gln Asn Trp Pro Asp Tyr Ala His
1 5 10 15

Leu Thr Pro Arg Glu Leu Ala Ser Ala Gly Leu Tyr Tyr Thr Gly Ala
20 25 30

Asp Asp Gln Val Gln Cys Phe Cys Cys Gly Gly Lys Leu Glu
35 40 45 

(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 45 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

Glu Asn Ala Arg Leu Leu Thr Phe Gln Thr Trp Pro Leu Thr Phe Leu
1 5 10 15

Ser Pro Thr Asp Leu Ala Arg Ala Gly Phe Tyr Tyr Thr Gly Pro Gly 
20 25 30 

Asp Arg Val Ala Cys Phe Ala Cys Gly Gly Lys Leu Ser 
35 40 45 

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 45 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:



Glu Glu Ala Arg Phe Leu Thr Tyr His Met Trp Pro Leu Thr Phe Leu 
1 5 10 15 

Ser Pro Ser Glu Leu Ala Arg Ala Gly Phe Tyr Tyr Ile Gly Pro Gly
20 25 30 

Asp Arg Val Ala Cys Phe Ala Cys Gly Gly Lys Leu Ser 
35 40 45 

(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 46 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

Glu Glu Ala Arg Leu Lys Ser Phe Gln Asn Trp Pro Asp Tyr Ala His
1 5 10 15

Leu Thr Pro Arg Glu Leu Ala Ser Ala Gly Leu Tyr Tyr Thr Gly Ile
20 25 30

Gly Asp Gln Val Gln Cys Phe Cys Cys Gly Gly Lys Leu Lys
35 40 45 

(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 45 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:20: 

Glu Ala Asn Arg Leu Val Thr Phe Lys Asp Trp Pro Asn Pro Asn Ile
1 5 10 15

Thr Pro Gln Ala Leu Ala Lys Ala Gly Phe Tyr Tyr Leu Asn Arg Leu
20 25 30

Asp His Val Lys Cys Val Trp Cys Asn Gly Val Ile Ala
35 40 45



(2) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 50 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:21: 

Tyr Val Gly Ile Gly Asp Lys Val Lys Cys Phe His Cys Asp Gly Gly
1 5 10 15

Leu Arg Asp Trp Glu Pro Gly Asp Asp Pro Trp Glu Glu His Ala Lys
20 25 30

Trp Phe Pro Arg Cys Glu Phe Leu Leu Leu Ala Lys Gly Gln Glu Tyr
35 40 45 

Val Ser 
50 

(2) INFORMATION FOR SEQ ID NO:22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 50 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 

Tyr Val Asp Arg Asn Asp Asp Val Lys Cys Phe Cys Cys Asp Gly Gly
1 5 10 15

Leu Arg Cys Trp Glu Pro Gly Asp Asp Pro Trp Ile Glu His Ala Lys
20 25 30

Trp Phe Pro Arg Cys Glu Phe Leu Ile Arg Met Lys Gly Gln Glu Phe
35 40 45 

Val Asp 
50 

(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 50 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23:

Tyr Gln Lys Ile Gly Asp Gln Val Arg Cys Phe His Cys Asn Ile Gly
1 5 10 15

Leu Arg Ser Trp Gln Lys Glu Asp Glu Pro Trp Phe Glu His Ala Lys
20 25 30

Trp Ser Pro Lys Cys Gln Phe Val Leu Leu Ala Lys Gly Pro Ala Tyr
35 40 45 

Val Ser 
50 

(2) INFORMATION FOR SEQ ID NO: 24:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 49 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 

Tyr Thr Gly Tyr Gly Asp Asn Thr Lys Cys Phe Tyr Cys Asp Gly Gly
1 5 10 15

Leu Lys Asp Trp Glu Pro Glu Asp Val Pro Trp Glu Gln His Val Arg
20 25 30

Trp Phe Asp Arg Cys Ala Tyr Val Gln Leu Val Lys Gly Arg Asp Tyr
35 40 45 

Val 



(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 49 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 

Tyr Thr Gly Gln Gly Asp Lys Thr Arg Cys Phe Cys Cys Asp Gly Gly
1 5 10 15

Leu Lys Asp Trp Glu Pro Asp Asp Ala Pro Trp Gln Gln His Ala Arg
20 25 30 

Trp Tyr Asp Arg Cys Glu Tyr Val Leu Leu Val Lys Gly Arg Asp Phe 
35 40 45 

Val 



(2) INFORMATION FOR SEQ ID NO:26:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 50 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:26: 

Tyr Thr Gly Ile Lys Asp Ile Val Gln Cys Phe Ser Cys Gly Gly Cys
1 5 10 15

Leu Glu Lys Trp Gln Glu Gly Asp Asp Pro Leu Asp Asp His Thr Arg
20 25 30

Cys Phe Pro Asn Cys Pro Phe Leu Gln Asn Met Lys Ser Ser Ala Glu
35 40 45 

Val Thr 
50 

(2) INFORMATION FOR SEQ ID NO: 27:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 50 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 

Tyr Gln Lys Ile Gly Asp Gln Val Arg Cys Phe His Cys Asn Ile Gly
1 5 10 15

Leu Arg Ser Trp Gln Lys Glu Asp Glu Pro Trp Phe Glu His Ala Lys
20 25 30

Trp Ser Pro Lys Cys Gln Phe Val Leu Leu Ala Lys Gly Pro Ser Tyr
35 40 45

Val Ser
50



(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 50 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:28: 

Ala Leu Gly Glu Gly Asp Lys Val Lys Cys Phe His Cys Gly Gly Gly
1 5 10 15

Leu Thr Asp Trp Lys Pro Ser Glu Asp Pro Trp Glu Gln His Ala Lys
20 25 30

Trp Tyr Pro Gly Cys Lys Tyr Leu Leu Glu Gln Lys Gly Gln Glu Tyr
35 40 45

Ile Asn
50 

(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 50 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 

Ala Leu Gly Glu Gly Asp Lys Val Lys Cys Phe His Cys Gly Gly Gly
1 5 10 15

Leu Thr Asp Trp Lys Pro Ser Glu Asp Pro Trp Glu Gln His Ala Lys
20 25 30

Trp Tyr Pro Gly Cys Lys Tyr Leu Leu Asp Glu Lys Gly Gln Glu Tyr
35 40 45

Ile Asn
50 

(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 50 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:



r" 1 ^^ 5 r o TrpValGlnHisMa^ 
1 r1v Asp Asp Pro Trp 3Q 

Leu Arg Cvs «P Glu Ser Gly 25 



20 

* „ Cvs Glu Tyr Leu He Arg 4 5 
Trp Phe Pro Arg Cys 4Q 

lie Arg 
50 

(2) INFORMATION FOR SEQ ID NO: 31:

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein



i rw Arg Asn Asp Asp lQ 
1 cy3 Ttp S.. "V » »» '» ™ 

„. «, - ». - «» 

TO «- ... sw **• « 

* 35 

Val Asp 
50 



(2) INFORMATION FOR SEQ ID NO: 32:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 50 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:

Ala Leu Gly Glu Gly Asp Lys Val Lys Cys Phe His Cys Gly Gly Gly
1 5 10 15

Leu Thr Asp Trp Lys Pro Ser Glu Asp Pro Trp Glu Gln His Ala Lys
20 25 30

Trp Tyr Pro Gly Cys Lys Tyr Leu Leu Glu Gln Lys Gly Gln Glu Tyr
35 40 45

Ile Asn
50 

(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 50 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 

Tyr Gln Lys Ile Gly Asp Gln Val Arg Cys Phe His Cys Asn Ile Gly
1 5 10 15

Leu Arg Ser Trp Gln Lys Glu Asp Glu Pro Trp Phe Glu His Ala Lys
20 25 30

Trp Ser Pro Lys Cys Gln Phe Val Leu Leu Ala Lys Gly Pro Ala Tyr
35 40 45

Val Ser 
50 

(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 142 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 

Met Gly Ala Pro Thr Leu Pro Pro Ala Trp Gln Pro Phe Leu Lys Asp
1 5 10 15

His Arg Ile Ser Thr Phe Lys Asn Trp Pro Phe Leu Glu Gly Cys Ala
20 25 30

Cys Thr Pro Glu Arg Met Ala Glu Ala Gly Phe Ile His Cys Pro Thr
35 40 45

Glu Asn Glu Pro Asp Leu Ala Gln Cys Phe Phe Cys Phe Lys Glu Leu
50 55 60

Glu Gly Trp Glu Pro Asp Asp Asp Pro Ile Glu Glu His Lys Lys His
65 70 75 80

Ser Ser Gly Cys Ala Phe Leu Ser Val Lys Lys Gln Phe Glu Glu Leu
85 90 95

Thr Leu Gly Glu Phe Leu Lys Leu Asp Arg Glu Arg Ala Lys Asn Lys
100 105 110

Ile Ala Lys Glu Thr Asn Asn Lys Lys Lys Glu Phe Glu Glu Thr Ala
115 120 125

Lys Lys Val Arg Arg Ala Ile Glu Gln Leu Ala Ala Met Asp
130 135 140

(2) INFORMATION FOR SEQ ID NO: 35:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14796 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 

TCTAGACATG CGGATATATT CAAGCTGGGC ACAGCACAGC AGCCCCACCC CAGGCAGCTT 60 

GAAATCAGAG CTGGGGTCCA AAGGGACCAC ACCCCGAGGG ACTGTGTGGG GGTCGGGGCA 120 

CACAGGCCAC TGCTTCCCCC CGTCTTTCTC AGCCATTCCT GAAGTCAGCC TCACTCTGCT 180 

TCTCAGGGAT TTCAAATGTG CAGAGACTCT GGCACTTTTG TAGAAGCCCC TTCTGGTCCT 240

AACTTACACC TGGATGCTGT GGGGCTGCAG CTGCTGCTCG GGCTCGGGAG GATGCTGGGG 300 

GCCCGGTGCC CATGAGCTTT TGAAGCTCCT GGAACTCGGT TTTGAGGGTG TTCAGGTCCA 360

GGTGGACACC TGGGCTGTCC TTGTCCATGC ATTTGATGAC ATTGTGTGCA GAAGTGAAAA 420

GGAGTTAGGC CGGGCATGCT GGCTTATGCC TGTAATCCCA GCACTTTGGG AGGCTGAGGC 480

GGGTGGATCA CGAGGTCAGG AGTTCAATAC CAGCCTGGCC AAGATGGTGA AACCCCGTCT 540

CTACTAAAAA TACAAAAAAA TTAGCCGGGC ATGGTGGCGG GCGCATGTAA TCCCAGCTAC 600 

TGGGGGGGCT GAGGCAGAGA ATTGCTGGAA CCCAGGAGAT GGAGGTTGCA GTGAGCCAAG 660 



ATTGTGCCAC TGCACTGCAC TCCAGCCTGG CGACAGAGCA AGACTCTGTC TCAAAAAAAA 720 

AAAAAAAAAG TGAAAAGGAG TTGTTCCTTT CCTCCCTCCT GAGGGCAGGC AACTGCTGCG 780

GTTGCCAGTG GAGGTGGTGC GTCCTTGGTC TGTGCCTGGG GGCCACCCCA GCAGAGGCCA 840

TGGTGGTGCC AGGGCCCGGT TAGCGAGCCA ATCAGCAGGA CCCAGGGGCG ACCTGCCAAA 900 

GTCAACTGGA TTTGATAACT GCAGCGAAGT TAAGTTTCCT GATTTTGATG ATTGTGTTGT 960 

GGTTGTGTAA GAGAATGAAG TATTTCGGGG TAGTATGGTA ATGCCTTCAA CTTACAAACG 1020 

GTTCAGGTAA ACCACCCATA TACATACATA TACATGCATG TGATATATAC ACATACAGGG 1080 

ATGTGTGTGT GTTCACATAT ATGAGGGGAG AGAGACTAGG GGAGAGAAAG TAGGTTGGGG 1140 

AGAGGGAGAG AGAAAGGAAA ACAGGAGACA GAGAGAGAGC GGGGAGTAGA GAGAGGGAAG 1200 

GGGTAAGAGA GGGAGAGGAG GAGAGAAAGG GAGGAAGAAG CAGAGAGTGA ATGTTAAAGG 1260 

AAACAGGCAA AACATAAACA GAAAATCTGG GTGAAGGGTA TATGAGTATT CTTTGTACTA 1320 

TTCTTGCAAT TATCTTTTAT TTAAATTGAC ATCGGGCCGG GCGCAGTGGC TCACATCTGT 1380 

AATCCCAGCA CTTTGGGAGG CCGAGGCAGG CAGATCACTT GAGGTCAGGA GTTTGAGACC 1440

AGCCTGGCAA ACATGGTGAA ACCCCATCTC TACTAAAAAT ACAAAAATTA GCCTGGTGTG 1500 

GTGGTGCATG CCTTTAATCT CAGCTACTCG GGAGGCTGAG GCAGGAGAAT CGCTTGAACC 1560

CGTGGCGGGG AGGAGGTTGC AGTGAGCTGA GATCATGCCA CTGCACTCCA GCCTGGGCGA 1620 

TAGAGCGAGA CTCAGTTTCA AATAAATAAA TAAACATCAA AATAAAAAGT TACTGTATTA 1680

AAGAATGGGG GCGGGGTGGG AGGGGTGGGG AGAGGTTGCA AAAATAAATA AATAAATAAA 1740

TAAACCCCAA AATGAAAAAG ACAGTGGAGG CACCAGGCCT GCGTGGGGCT GGAGGGCTAA 1800 

TAAGGCCAGG CCTCTTATCT CTGGCCATAG AACCAGAGAA GTGAGTGGAT GTGATGCCCA 1860

GCTCCAGAAG TGACTCCAGA ACACCCTGTT CCAAAGCAGA GGACACACTG ATTTTTTTTT 1920 

TAATAGGCTG CAGGACTTAC TGTTGGTGGG ACGCCCTGCT TTGCGAAGGG AAAGGAGGAG 1980 

TTTGCCCTGA GCACAGGCCC CCACCCTCCA CTGGGCTTTC CCCAGCTCCC TTGTCTTCTT 2040

ATCACGGTAG TGGCCCAGTC CCTGGCCCCT GACTCCAGAA GGTGGCCCTC CTGGAAACCC 2100 

AGGTCGTGCA GTCAACGATG TACTCGCCGG GACAGCGATG TCTGCTGCAC TCCATCCCTC 2160 

CCCTGTTCAT TTGTCCTTCA TGCCCGTCTG GAGTAGATGC TTTTTGCAGA GGTGGCACCC 2220 

TGTAAAGCTC TCCTGTCTGA CTTTTTTTTT TTTTTTAGAC TGAGTTTTGC TCTTGTTGCC 2280 

TAGGCTGGAG TGCAATGGCA CAATCTCAGC TCACTGCACC CTCTGCCTCC CGGGTTCAAG 2340

CGATTCTCCT GCCTCAGCCT CCCGAGTAGT TGGGATTACA GGCATGCACC ACCACGCCCA 2400

GCTAATTTTT GTATTTTTAG TAGAGACAAG GTTTCACCGT GATGGCCAGG CTGGTCTTGA 2460

ACTCCAGGAC TCAAGTGATG CTCCTGCCTA GGCCTCTCAA AGTGTTGGGA TTACAGGCGT 2520

GAGCCACTGC ACCCGGCCTG CACGCGTTCT TTGAAAGCAG TCGAGGGGGC GCTAGGTGTG 2580

GGCAGGGACG AGCTGGCGCG GCGTCGCTGG GTGCACCGCG ACCACGGGCA GAGCCACGCG 2640

GCGGGAGGAC TACAACTCCC GGCACACCCC GCGCCGCCCC GCCTCTACTC CCAGAAGGCC 2700

GCGGGGGGTG GACCGCCTAA GAGGGCGTGC GCTCCCGACA TGCCCCGCGG CGCGCCATTA 2760

ACCGCCAGAT TTGAATCGCG GGACCCGTTG GCAGAGGTGG CGGCGGCGGC ATGGGTGCCC 2820

CGACGTTGCC CCCTGCCTGG CAGCCCTTTC TCAAGGACCA CCGCATCTCT ACATTCAAGA 2880 

ACTGGCCCTT CTTGGAGGGC TGCGCCTGCA CCCCGGAGCG GGTGAGACTG CCCGGCCTCC 2940

TGGGGTCCCC CACGCCCGCC TTGCCCTGTC CCTAGCGAGG CCACTGTGAC TGGGCCTCGG 3000 

GGGTACAAGC CGCCCTCCCC TCCCCGTCCT GTCCCCAGCG AGGCCACTGT GGCTGGGCCC 3060

CTTGGGTCCA GGCCGGCCTC CCCTCCCTGC TTTGTCCCCA TCGAGGCCTT TGTGGCTGGG 3120 

CCTCGGGGTT CCGGGCTGCC ACGTCCACTC ACGAGCTGTG CTGTCCCTTG CAGATGGCCG 3180 

AGGCTGGCTT CATCCACTGC CCCACTGAGA ACGAGCCAGA CTTGGCCCAG TGTTTCTTCT 3240

GCTTCAAGGA GCTGGAAGGC TGGGAGCCAG ATGACGACCC CATGTAAGTC TTCTCTGGCC 3300

AGCCTCGATG GGCTTTGTTT TGAACTGAGT TGTCAAAAGA TTTGAGTTGC AAAGACACTT 3360 

AGTATGGGAG GGTTGCTTTC CACCCTCATT GCTTCTTAAA CAGCTGTTGT GAACGGATAC 3420

CTCTCTATAT GCTGGTGCCT TGGTGATGCT TACAACCTAA TTAAATCTCA TTTGACCAAA 3480

ATGCCTTGGG GTGGACGTAA GATGCCTGAT GCCTTTCATG TTCAACAGAA TACATCAGCA 3540

GACCCTGTTG TTGTGAACTC CCAGGAATGT CCAAGTGCTT TTTTTGAGAT TTTTTAAAAA 3600 

ACAGTTTAAT TGAAATATAA CCTACACAGC ACAAAAATTA CCCTTTGAAA GTGTGCACTT 3660 

CACACTTTCG GAGGCTGAGG CGGGCGGATC ACCTGAGGTC AGGAGTTCAA GACCTGCCTG 3720

GCCAACTTGG CGAAACCCCG TCTCTACTAA AAATACAAAA ATTAGCCGGG CATGGTAGCG 3780

CACGCCCGTA ATCCCAGCTA CTCGGGAGGC TAAGGCAGGA GAATCGCTTG AACCTGGGAG 3840

GCGGAGGTTG CAGTGAGCCG AGATTGTGCC AATGCACTCC AGCCTCGGCG ACAGAGCGAG 3900 

ACTCCGTCAT AAAAATAAAA AATTGAAAAA AAAAAAAGAA AGAAAGCATA TACTTCAGTG 3960 

TTGTTCTGGA TTTTTTTCTT CAAGATGCCT AGTTAATGAC AATGAAATTC TGTACTCGGA 4020 

TGGTATCTGT CTTTCCACAC TGTAATGCCA TATTCTTTTC TCACCTTTTT TTCTGTCGGA 4080 

TTCAGTTGCT TCCACAGCTT TAATTTTTTT CCCCTGGAGA ATCACCCCAG TTGTTTTTCT 4140 

TTTTGGCCAG AAGAGAGTAG CTGTTTTTTT TCTTAGTATG TTTGCTATGG TGGTTATACT 4200 

GCATCCCCGT AATCACTGGG AAAAGATCAG TGGTATTCTT CTTGAAAATG AATAAGTGTT 4260

ATGATATTTT CAGATTAGAG TTACAACTGG CTGTCTTTTT GGACTTTGTG TGGCCATGTT 4320

TTCATTGTAA T^'AGTTCTG GTAACGGTGA TAGTCAGTTA TACAGGGAGA CTCCCCTAGC 4380

AGAAAATGAG AGTGTGAGCT AGGGGGTCCC TTGGGGAACC CGGGGCAATA ATGCCCTTCT 4440

CTGCCCTTAA TCCTTACAGT GGGCCGGGCA CGGTGGCTTA CGCCTGTAAT ACCAGCACTT 4500

TGGGAGGCCG AGGCGGGCGG ATCACGAGGT CAGGAGATCG AGACCATCTT GGCTAATACG 4560

GTGAAACCCC GTCTCCACTA AAAATACAAA AAATTAGCCG GGCGTGGTGG TGGGCGCCTG 4620

TAGTCCCAGC TACTCGGGAG GCTGAGGCAG GAGAATGGCG TGAACCCAGG AGGCGGAGCT 4680

TGCAGTGAGC CGAGATTGCA CCACTGCACT CCAGCCTGGG CGACAGAATG AGACTCCGTC 4740

TCAAAAAAAA AAAAAAAAGA AAAAAATCTT TACAGTGGAT TACATAACAA TTCCAGTGAA 4800

ATGAAATTAC TTCAAACAGT TCCTTGAGAA TGTTGGAGGG ATTTGACATG TAATTCCTTT 4860

GGACATATAC CATGTAACAC TTTTCCAACT AATTGCTAAG GAAGTCCAGA TAAAATAGAT 4920

ACATTAGCCA CACAGATGTG GGGGGAGATG TCCACAGGGA GAGAGAAGGT GCTAAGAGGT 4980

GCCATATGGG AATGTGGCTT GGGCAAAGCA CTGATGCCAT CAACTTCAGA CTTGACGTCT 5040

TACTCCTGAG GCAGAGCAGG GTGTGCCTGT GGAGGGCGTG GGGAGGTGGC CCGTGGGGAG 5100 

TGGACTGCCG CTTTAATCCC TTCAGCTGCC TTTCCGCTGT TGTTTTGATT TTTCTAGAGA 5160 

GGAACATAAA AAGCATTCGT CCGGTTGCGC TTTCCTTTCT GTCAAGAAGC AGTTTGAAGA 5220 

ATTAACCCTT GGTGAATTTT TGAAACTGGA CAGAGAAAGA GCCAAGAACA AAATTGTATG 5280 

TATTGGGAAT AAGAACTGCT CAAACCCTGT TCAATGTCTT TAGCACTAAA CTACCTAGTC 5340

CCTCAAAGGG ACTCTGTGTT TTCCTCAGGA AGCATTTTTT TTTTTTTTCT GAGATAGAGT 5400

TTCACTCTTG TTGCCCAGGC TGGAGTGCAA TGGTGCAATC TTGGCTCACT GCAACCTCTG 5460

CCTCTCGGGT TCAAGTGATT CTCCTGCCTC AGCCTCCCAA GTAACTGGGA TTACAGGGAA 5520 

GTGCCACCAC ACCCAGCTAA TTTTTGTATT TTTAGTAGAG ATGGGGTTTC ACCACATTGC 5580

CCAGGCTGGT CTTGAACTCC TGACCTCGTG ATTCGCCCAC CTTGGCCTCC CAAAGTGCTG 5640

GGATTACAGG CGTGAACCAC CACGCCTGGC TTTTTTTTTT TTGTTCTGAG ACACAGTTTC 5700 
ACTCTGTTAC CCAGGCTGGA GTAGGGTGGC CTGATCTCGG ATCACTGCAA CCTCCGCCTC 5760
CTGGGCTCAA GTGATTTGCC TGCTTCAGCC TCCCAAGTAG CCGAGATTAC AGGCATGTGC 5820
CACCACACCC AGGTAATTTT TGTATTTTTG GTAGAGACGA GGTTTCACCA TGTTGGCCAG 5880
GCTGGTTTTG AACTCCTGAC CTCAGGTGAT CCACCCGCCT CAGCCTCCCA AAGTGCTGAG 5940
ATTATAGGTG TGAGCCACCA CACCTGGCCT CAGGAAGTAT TTTTATTTTT AAATTTATTT 6000 
ATTTATTTGA GATGGAGTCT TGCTCTGTCG CCCAGGCTAG AGTGCAGCGA CGGGATCTCG 6060 
GCTCACTGCA AGCTCCGCCC CCCAGGTTCA AGCCATTCTC CTGCCTCAGC CTCCCGAGTA 6120 
GCTGGGACTA CAGGCGCCCG CCACCACACC CGGCTAATTT TTTTGTATTT TTAGTAGAGA 6180 



CGGGTTTTCA CCGTGTTAGC CAGGAGGGTC TTGATCTCCT GACCTCGTGA TCTGCCTGCC 6240

TCGGCCTCCC AAAGTGCTGG GATTACAGGT GTGAGCCACC ACACCCGGCT ATTTTTATTT 6300 

TTTTGAGACA GGGACTCACT CTGTCACCTG GGCTGCAGTG CAGTGGTACA CCATAGCTCA 6360

CTGCAGCCTC GAACTCCTGA GCTCAAGTGA TCCTCCCACC TCATCCTCAC AAGTAATTGG 6420

GACTACAGGT GCACCCCACC ATGCCCACCT AATTTATTTA TTTATTTATT TATTTATTTT 6480

CATAGAGATG AGGGTTCCCT GTGTTGTCCA GGCTGGTCTT GAACTCCTGA GCTCACGGGA 6540

TCCTTTTGCC TGGGCCTCCC AAAGTGCTGA GATTACAGGC ATGAGCCACC GTGCCCAGCT 6600 

AGGAATCATT TTTAAAGCCC CTAGGATGTC TGTGTGATTT TAAAGCTCCT GGAGTGTGGC 6660 

CGGTATAAGT ATATACCGGT ATAAGTAAAT CCCACATTTT GTGTCAGTAT TTACTAGAAA 6720 

CTTAGTCATT TATCTGAAGT TGAAATGTAA CTGGGCTTTA TTTATTTATT TATTTATTTA 6780

TTTATTTTTA ATTTTTTTTT TTGAGACGAG TCTCACTTTG TCACCCAGGC TGGAGTGCAG 6840

TGGCACGATC TCGGCTCACT GCAACCTCTG CCTCCCGGGG TCAAGCGATT CTCCTGCCTT 6900 

AGCCTCCCGA GTAGCTGGGA CTACAGGCAC GCACCACCAT GCCTGGCTAA TTTTTGTATT 6960 

TTTAGTAGAC GGGGTTTCAC CATGCTGGCC AAGCTGGTCT CAAACTCCTG ACCTTGTGAT 7020 

CTGCCCGCTT TAGCCTCCCA GAGTGCTGGG ATTACAGGCA TGAGCCACCA TGCGTGGTCT 7080

TTTTAAAATT TTTTGATTTT TTTTTTTTTT GAGACAGAGC CTTGCTCTGT CGCCCAGGCT 7140

GGAGTGCAGT GGCACGATCT CAGCTCACTA CAAGCTCCGC CTCCCGGGTT CACGCCATTC 7200

TTCTGCCTCA GCCTCCTGAG TAGCTGGGAC TACAGGTGCC CACCACCACG CCTGGCTAAT 7260

TTTTTTTGGT ATTTTTATTA GAGACAAGGT TTCATCATGT TGGCCAGGCT GGTCTCAAAC 7320

TCCTGACCTC AAGTGATCTG CCTGCCTCGG CCTCCCAAAG CGCTGAGATT ACAGGTGTGA 7380

TCTACTGCGC CAGGCCTGGG CGTCATATAT TCTTATTTGC TAAGTCTGGC AGCCCCACAC 7440

AGAATAAGTA CTGGGGGATT CCATATCCTT GTAGCAAAGC CCTGGGTGGA GAGTCAGGAG 7500

ATGTTGTAGT TCTGTCTCTG CCACTTGCAG ACTTTGAGTT TAAGCCAGTC GTGCTCATGC 7560

TTTCCTTGCT AAATAGAGGT TAGACCCCCT ATCCCATGGT TTCTTCAGGTT GCTTTTCAGC 7620

TTGAAAATTG TATTCCTTTG TAGAGATCAG CGTAAAATAA TTCTGTCCTT ATATGTGGCT 7680

TTATTTTAAT TTGAGACAGA GTGTCACTCA GTCGCCCAGG CTGGAGTGTG GTGGTGCGAT 7740

CTTGGCTCAC TGCGACCTCC ACCTCCCAGG TTCAAGCGAT TCTCGTGCCT CAGGCTCCCA 7800

AGTAGCTGAG ATTATAGGTG TGTGCCACCA GGCCCAGCTA ACTTTTGTAT TTTTAGTAGA 7860

GACAGGGTTT TGCCATGTTG GCTAAGCTGG TCTCGAACTC CTGGCCTCAA GTGATCTGCC 7920

CGCCTTGGCA TCCCAAAGTG CTGGGATTAC AGGTGTGAAC CACCACACCT GGCCTCAATA 7980



TAGTGGCTTT TAAGTGCTAA GGACTGAGAT TGTGTTTTGT CAGGAAGAGG CCAGTTGTGG 8040

GTGAAGCATG CTGTGAGAGA GCTTGTCACC TGGTTGAGGT TGTGGGAGCT GCAGCGTGGG 8100

AACTGGAAAG TGGGCTGGGG ATCATCTTTT TCCAGGTCAG GGGTCAGCCA GCTTTTCTGC 8160

AGCGTGCCAT AGACCATCTC TTAGCCCTCG TGGGTCAGAG TCTCTGTTGC ATATTGTCTT 8220

TTGTTGTTTT TCACAACCTT TTAGAAACAT AAAAAGCATT CTTAGCCCGT GGGCTGGACA 8280

AAAAAAGGCC ATGACGGGCT GTATGGATTT GGCCCAGCAG GCCCTTGCTT GCCAAGCCCT 8340

GTTTTAGACA AGGAGCAGCT TGTGTGCCTG GAACCATCAT GGGCACAGGG GAGGAGCAGA 8400

GTGGATGTGG AGGTGTGAGC TGGAAACCAG GTCCCAGAGC GCTGAGAAAG ACAGAGGGTT 8460

TTTGCCCTTG CAAGTAGAGC AACTGAAATC TGACACCATC CAGTTCCAGA AAGCCCTGAA 8520

GTGCTGGTGG ACGCTGCGGG GTGCTCCGCT CTAGGGTTAC AGGGATGAAG ATGCAGTCTG 8580

GTAGGGGGAG TCCACTCACC TGTTGGAAGA TGTGATTAAG AAAAGTAGAC TTTCAGGGCC 8640

GGGCATGGTG GCTCACGCCT GTAATCCCAG CACTTTGGGA GGCCGAGGCG GGTGGATCAC 8700

GAGGTCAGGA GATCGAGACC ATCCTGGCTA ACATGGTGAA ACCCCGTCTT TACTAAAAAT 8760

ACAAAAAATT AGCTGGGCGT GGTGGCGGGC GCCTGTAGTC CCAGCTACTC GGGAGGCTGA 8820 

GGCAGGAGAA TGGCGTGAAC CTGGGAGGTG GAGCTTGCTG TGAGCCGAGA TCGCGCCACT 8880 

GCACTCCAGC CTGGGCGACA GAGCGAGACT CCGTCTCAAA AAAAAAAAAA AAAGTAGGCT 8940

TTCATGATGT GTGAGCTGAA GGCGCAGTAG GCAGAAGTAG AGGCCTCAGT CCCTGCAGGA 9000 

GACCCCTCGG TCTCTATCTC CTGATAGTCA GACCCAGCCA CACTGGAAAG AGGGGAGACA 9060 

TTACAGCCTG CGAGAAAAGT AGGGAGATTT AAAAACTGCT TGGCTTTTAT TTTGAACTGT 9120 

TTTTTTTGTT TGTTTGTTTT CCCCAATTCA GAATACAGAA TACTTTTATG GATTTGTTTT 9180 

TATTACTTTA ATTTTGAAAC AATATAATCT TTTTTTTGTT GTTTTTTTGA GACAGGGTCT 9240

TACTCTGTCA CCCAGGCTGA GTGCAGTGGT GTGATCTTGG CTCACCTCAG CCTCGACCCC 9300 

CTGGGCTCAA ATGATTCTCC CACCTCAGCT TCCCAAGTAG CTGGGACCAC AGGTGCGTGT 9360 

GTTGCGCTAT ACAAATCCTG AAGACAAGGA TGCTGTTGCT GGTGATGCTG GGGATTCCCA 9420

AGATCCCAGA TTTGATGGCA GGATGCCCCT GTCTGCTGCC TTGCCAGGGT GCCAGGAGGG 9480

CGCTGCTGTG GAAGCTGAGG CCCGGCCATC CAGGGCGATG CATTGGGCGC TGATTCTTGT 9540

TCCTGCTGCT GCCTCGGTGC TTAGCTTTTG AAACAATGAA ATAAATTAGA ACCAGTGTGA 9600 

AAATCGATCA GGGAATAAAT TTAATGTGGA AATAAACTGA ACAACTTAGT TCTTCATAAG 9660 

AGTTTACTTG GTAAATACTT GTGATGAGGA CAAAACGAAG CACTAGAAGG AGAGGCGAGT 9720

TGTAGACCTG GGTGGCAGGA GTGTTTTGTT TGTTTTCTTT GGCAGGGTCT TGCTCTGTTG 9780

CTCAGGCTGG AGTACAGTGG CACAATCACA GCTCACTATA GCCTCGACCT CCTGGACTCA 9840

AGCAATCCTC CTGCCTCAGC CTCCCAGTAG CTGGGACTAC AGGCGCATGC CACCATGCCT 9900 

GGCTAATTTT AAATTTTTTT TTTTCTCTTT TTTGAGATGG AATCTCACTC TGTCGCCCAG 9960 

GCTGGAGTGC AGTGGCGTGA TCTCGGCTGA CGGCAAGCTC CGCCTCCCAG GTTCACTCCA 10020 

TTCGCCTGCC TCAGCCTCCC AAGTAGCTGG GACTACAGGC GCTGGGATTA CAAACCCAAA 10080 

CCCAAAGTGC TGGGATTACA GGCGTGAGCC ACTGCACCCG GCCTGTTTTG TCTTTCAATA 10140

GCAAGAGTTG TGTTTGCTTC GCCCCTACCT TTAGTGGAAA AATGTATAAA ATGGAGATAT 10200 

TGACCTCCAC ATTGGGGTGG TTAAATTATA GCATGTATGC AAAGGAGCTT CGCTAATTTA 10260

AGGCTTTTTT GAAAGAGAAG AAACTGAATA ATCCATGTGT GTATATATAT TTTAAAAGCC 10320 

ATGGTCATCT TTCCATATCA GTAAAGCTGA GGCTCCCTGG GACTGCAGAG TTGTCCATCA 10380

CAGTCCATTA TAAGTGCGCT GCTGGGCCAG GTGCAGTGGC TTGTGCCTGA ATCCCAGCAC 10440 

TTTGGGAGGC CAAGGCAGGA GGATTCATTG AGCCCAGGAG TTTTGAGGCG AGCCTGGGCA 10500 

ATGTGGCCAG ACCTCATCTC TTCAAAAAAT ACACAAAAAA TTAGCCAGGC ATGGTGGCAC 10560

GTGCCTGTAG TCTCAGCTAC TCAGGAGGCT GAGGTGGGAG GATCACTTTG AGCCTTGCAG 10620

GTCAAAGCTG CAGTAAGCCA TGATCTTGCC ACTGCATTCC AGCCTGGATG ACAGAGCGAG 10680 

ACCCTGTCTC TAAAAAAAAA AAAAACCAAA CGGTGCACTG TTTTCTTTTT TCTTATCAAT 10740 

TTATTATTTT TAAATTAAAT TTTCTTTTAA TAATTTATAA ATTATAAATT TATATTAAAA 10800 

AATGACAAAT TTTTATTACT TATACATGAG GTAAAACTTA GGATATATAA AGTACATATT 10860

GAAAAGTAAT TTTTTGGCTG GCACAGTGGC TCACACCTGT AATCCCAGCA CTTTGGGAGG 10920

CCGTGGCGGG CAGATCACAT GAGATCATGA GTTCGAGACC AACCTGACCA ACATGGAGAG 10980 

ACCCCATCTC TACTAAAAAT ACAAAATTAG CCGGGGTGGT GGCGCATGCC TGTAATCCCA 11040 

GCTACTCGGG AGGCTGAGGC AGGAGAATCT CTTGAACCCG GGAGGCAGAG GTTGCGGTGA 11100

GCCAAGATCG TGCCTTTGCA CACCAGCCTA GGCAACAAGA GCGAAAGTCC GTCTCAAAAA 11160

AAAAGTAATT TTTTTTAAGT TAACCTCTGT CAGCAAACAA ATTTAACCCA ATAAAGGTCT 11220 

TTGTTTTTTA ATGTAGTAGA GGAGTTAGGG TTTATAAAAA ATATGGTAGG GAAGGGGGTC 11280 

CCTGGATTTG CTAATGTGAT TGTCATTTGC CCCTTAGGAG AGAGCTCTGT TAGCAGAATG 11340 

AAAAAATTGG AAGCCAGATT CAGGGAGGGA CTGGAAGCAA AAGAATTTCT GTTCGAGGAA 11400 

GAGCCTGATG TTTGCCAGGG TCTGTTTAAC TGGACATGAA GAGGAAGGCT CTGGACTTTC 11460

CTCCAGGAGT TTCAGGAGAA AGGTAGGGCA GTGGTTAAGA GCAGAGCTCT GCCTAGACTA 11520 

GCTGGGGTGC CTAGACTAGC TGGGGTGCCC AGACTAGCTG GGGTGCCTAG ACTAGCTGGG 11580 

TACTTTGAGT GGCTCCTTCA GCCTGGACCT CGGTTTCCTC ACCTGTATAG TAGAGATATG 11640 

GGAGCACCCA GCGCAGGATC ACTGTGAACA TAAATCAGTT AATGGAGGAA GCAGGTAGAG 11700

TGGTGCTGGG TGCATACCAA GCACTCCGTC AGTGTTTCCT GTTATTCGAT GATTAGGAGG 11760

CAGCTTAAAC TAGAGGGAGT TGAGCTGAAT CAJGATGTTT GTCCCAGGTA GCTGGGAATC 11820

TGCCTAGCCC AGTGCCCAGT TTATTTAGGT GCTCTCTCAG TGTTCCCTGA TTGTTTTTTC 11880

CTTTGTCATC TTATCTACAG GATGTGACTG GGAAGCTCTG GTTTCAGTGT CATGTGTCTA 11940

TTCTTTATTT CCAGGCAAAG GAAACCAACA ATAAGAAGAA AGAATTTGAG GAAACTGCGA 12000

AGAAAGTGCG CCGTGCCATC GAGCAGCTGG CTGCCATGGA TTGAGGCCTC TGGCCGGAGC 12060

TGCCTGGTCC CAGAGTGGCT GCACCACTTC CAGGGTTTAT TCCCTGGTGC CACCAGCCTT 12120

CCTGTGGGCC CCTTAGCAAT GTCTTAGGAA AGGAGATCAA CATTTTCAAA TTAGATGTTT 12180

CAACTGTGCT CCTGTTTTGT CTTGAAAGTG GCACCAGAGG TGCTTCTGCC TGTGCAGCGG 12240

GTGCTGCTGG TAACAGTGGC TGCTTCTCTC TCTCTCTCTC TTTTTTGGGG GCTCATTTTT 12300

GCTGTTTTGA TTCCCGGGCT TACCAGGTGA GAAGTGAGGG AGGAAGAAGG CAGTGTCCCT 12360

TTTGCTAGAG CTGACAGCTT TGTTCGCGTG GGCAGAGCCT TCCACAGTGA ATGTGTCTGG 12420

ACCTCATGTT GTTGAGGCTG TCACAGTCCT GAGTGTGGAC TTGGCAGGTG CCTGTTGAAT 12480

CTGAGCTGCA GGTTCCTTAT CTGTCACACC TGTGCCTCCT CAGAGGACAG TTTTTTTGTT 12540

GTTGTGTTTT TTTGTTTTTT TTTTTTGGTA GATGCATGAC TTGTGTGTGA TGAGAGAATG 12600

GAGACAGAGT CCCTGGCTCC TCTACTGTTT AACAACATGG CTTTCTTATT TTGTTTGAAT 12660

TGTTAATTCA CAGAATAGCA CAAACTACAA TTAAAACTAA GCACAAAGCC ATTCTAAGTC 12720

ATTGGGGAAA CGGGGTGAAC TTCAGGTGGA TGAGGAGACA GAATAGAGTG ATAGGAAGCG 12780

TCTGGCAGAT ACTCCTTTTG CCACTGCTGT GTGATTAGAC AGGCCCAGTG AGCCGCGGGG 12840

CACATGCTGG CCGCTCCTCC CTCAGAAAAA GGCAGTGGCC TAAATCCTTT TTAAATGACT 12900

TGGCTCGATG CTGTGGGGGA CTGGCTGGGC TGCTGCAGGC CGTGTGTCTG TCAGCCCAAC 12960

CTTCACATCT GTCACGTTCT CCACACGGGG GAGAGACGCA GTCCGCCCAG GTCCCCGCTT 13020

TCTTTGGAGG CAGCAGCTCC CGCAGGGCTG AAGTCTGGCG TAAGATGATG GATTTGATTC 13080

GCCCTCCTCC CTGTCATAGA GCTGCAGGGT GGATTGTTAC AGCTTCGCTG GAAACCTCTG 13140

GAGGTCATCT CGGCTGTTCC TGAGAAATAA AAAGCCTGTC ATTTCAAACA CTGCTGTGGA 13200

CCCTACTGGG TTTTTAAAAT ATTGTCAGTT TTTCATC-GTC GTCCCTAGCC TGCCAACAUL 13260

CATCTGCCCA GACAGCCGCA GTGAGGATGA GCGTCCTGGC AGAGACGCAG TTGTCTCTGG 13320

GCGCTTGCCA GAGCCACGAA CCCCAGACCT GTTTGTATCA TCCGGGCTCC TTCCGGGCAG 13380

AAACAACTGA AAATGCACTT CAGACCCACT TATTTATGCC ACATCTGAGT CGGCCTGAGA 13440

TAGACTTTTC CCTCTAAACT GGGAGAATAT CACAGTGGTT TTTGTTAGCA GAAAATGCAC 13500



aagctgctta TmtGATAT ttgtgtcagt ctgtaaatgg >-35 

TCCAGCCTCT GTACTCATCT AAGC T GG T T ^ „« 
CCTTAAAGGC CATCCTTAAA ctargc gcAA GCGCCTAGAC TTTGTT1GAG H

ACATCTGTTA ATAAAGCCG, AGGCCCTTGT ^ 

„CC ACATGTCCAT ^ „G CCCCTCGGGC > 

TGTGAATGAG GCTTCTGGGC ~ rcc „ , 
ATCACACCCG GC.AA.GAA qgTTCAGCGT CCGACTTGTT

c „ CGAAACXCC GCAGAAATGA 

„AG " ^ Q „ TAAGTAACTT TTAGAGCTGG 

— *— ^AGAG CTGGGCCCTC ACTGCTGAAG GACACTGTCA 

gatttgaacc caggcaatct g^g gtctttwrg 

GCTTGGGAGG GTGGCTATGG ^ CTGACACCTG 

ccgcca^g ca^caga qcttttattt tgaaatgaaa 

CCTCCCCAAG GCTTCCATAG ATCCTCTCT 